 specific instruction


ChartHal: A Fine-grained Framework Evaluating Hallucination of Large Vision Language Models in Chart Understanding

arXiv.org Artificial Intelligence

Large Vision-Language Models (LVLMs) have recently demonstrated remarkable progress, yet hallucination remains a critical barrier, particularly in chart understanding, which requires sophisticated perceptual and cognitive abilities as well as rigorous factual accuracy. While prior work has investigated hallucinations and chart comprehension independently, their intersection remains largely unexplored. To address this gap, we present ChartHal, a benchmark that features a fine-grained taxonomy of hallucination scenarios in chart understanding, along with a human-validated dataset of 1,062 samples. Our evaluation shows that state-of-the-art LVLMs suffer from severe hallucinations on ChartHal, including proprietary models such as GPT-5 and o4-mini, which achieve only 34.46% and 22.79% accuracy, respectively. Further analysis reveals that questions involving information absent from or contradictory to charts are especially likely to trigger hallucinations, underscoring the urgent need for more robust mitigation strategies. Code and data are available at https://github.com/ymcui/ChartHal .


Specify and Edit: Overcoming Ambiguity in Text-Based Image Editing

arXiv.org Artificial Intelligence

Text-based editing diffusion models exhibit limited performance when the user's input instruction is ambiguous. To solve this problem, we propose $\textit{Specify ANd Edit}$ (SANE), a zero-shot inference pipeline for diffusion-based editing systems. We use a large language model (LLM) to decompose the input instruction into specific instructions, i.e., well-defined interventions to apply to the input image to satisfy the user's request. We benefit from the LLM-derived instructions alongside the original one, thanks to a novel denoising guidance strategy specifically designed for the task. Our experiments with three baselines and on two datasets demonstrate the benefits of SANE in all setups. Moreover, our pipeline improves the interpretability of editing models and boosts the output diversity. We also demonstrate that our approach can be applied to any edit, whether ambiguous or not. Our code is public at https://github.com/fabvio/SANE.


Top 15 Best Technology Trends 2022

#artificialintelligence

There will be a lot of technology trends to follow in 2022, and they will affect different industries and different aspects of our lives. Many of these trends are already at play, but some will take time to come to the fore. In this blog, we will look at these trends and predict what they might mean for our lives in the years to come. Technology is a fast-paced industry, and there is a lot of buzz. Everyone is talking about the latest trends, from the latest app to the latest technology.


Incredible footage from augmented reality glasses shows how they are helping engineers work

Daily Mail - Science & tech

Augmented reality glasses could soon replace repair manuals. Fieldbit, a technology company based in Mountain View, has developed an AR application targeted at engineers and field repair specialists that will place instructions for how to operate machinery and repair malfunctioning industrial equipment directly into one's field of view. The technology allows an engineer to see a live feed from the glasses of a worker on the ground and place specific instructions into the environment to guide them through a maintenance or repair procedure. Fieldbit's AR software will give detailed instructions to field workers on site. For routine procedures, companies can record the instructions and spatial information into a database so future employees can access the information at any time.


Learning at Scale & The End of "If-Then" Logic. – archieai – Medium

#artificialintelligence

In 2001, a group of physicists was awarded the Nobel Prize in Physics for creating an experiment that produced the Bose-Einstein condensate (BEC). The BEC is a state of matter that forms at extremely low temperatures, close to absolute zero (that is, very near 0 K, or −273.15 °C), first theorized by Satyendra Nath Bose and Albert Einstein in 1925. In the 2001 Nobel Prize-winning experiment, the physicists created the first BEC in a lab by shooting multiple lasers at gas particles from different directions. After meticulous calculations and planning, they carefully calibrated a series of lasers to achieve this.


Demystifying Machine Learning Part 1

#artificialintelligence

This is the first of a three-post series on machine learning. Is machine learning just a fancy word for the same old computer programming we have employed for decades? Or, is machine learning a mystical computer that can learn anything? More importantly, why does it matter to your business? The most effective way to define machine learning is to compare it with traditional computer programming.


Mechanical Chess Player

Classics

I don't say "beat" its designer; I say Let us assume that the machine cannot analyze the position right out and that it must make judgments. The problem, then, becomes that the machine must form its own criteria for judgment, and, if it is to beat its designer, it must form better judgments than the designer can put into it. Can we build such a machine? The problem that faces the designer is the same as that of the father who is not a good chess player and who wants his son to become world champion. Obviously, he must be very careful about what he teaches the boy.